clean model
Setup in Detail
We implement our attack framework using Python 3.7.3 and PyTorch 1.7.13, which supports CUDA 11.0 for GPU-accelerated computation. We run our experiments on a machine equipped with Intel i5-8400 2.80GHz 6-core processors, 16 GB of RAM, and four Nvidia GTX 1080 Ti GPUs. To compute the Hessian trace, we use a virtual machine equipped with Intel E5-2686v4 2.30GHz 8-core processors, 64 GB of RAM, and an Nvidia Tesla V100 GPU. For all our attacks in Sections 4.1, 4.2, 4.3, and 4.5, we use symmetric quantization for the weights and asymmetric quantization for the activations, a default configuration in many deep learning frameworks supporting quantization. Quantization granularity is layer-wise for both the weights and the activations.
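The quantization configuration described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 8-bit default, and the min/max range calibration are assumptions made for illustration. "Layer-wise" is modeled here as a single scale (and zero-point) shared by all values of one layer.

```python
def quantize_symmetric(values, num_bits=8):
    """Layer-wise symmetric quantization (zero-point fixed at 0), as assumed for weights."""
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0    # one scale for the whole layer
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantize_asymmetric(values, num_bits=8):
    """Layer-wise asymmetric quantization (non-zero zero-point), as assumed for activations."""
    qmin, qmax = 0, 2 ** num_bits - 1                 # 0..255 for 8 bits
    lo = min(min(values), 0.0)                        # range is widened to include 0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)             # integer that represents real 0
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Map integer codes back to approximate real values."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical layer values, chosen only to exercise the functions.
weights = [-0.5, 0.0, 0.25, 1.0]
activations = [-1.0, 0.0, 3.0]

w_q, w_scale = quantize_symmetric(weights)
a_q, a_scale, a_zp = quantize_asymmetric(activations)
w_hat = dequantize(w_q, w_scale)
a_hat = dequantize(a_q, a_scale, a_zp)
```

The symmetric scheme spends its integer range evenly around zero, which suits roughly zero-centered weights; the asymmetric scheme shifts the range with a zero-point, which suits activations whose distribution is skewed (for example, after a ReLU).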
Error Correction Output Codes for Robust Neural Networks against Weight-errors: A Neural Tangent Kernel Point of View
Error correcting output code (ECOC) is a classic method that encodes binary classifiers to tackle the multi-class classification problem in decision trees and neural networks. Among ECOCs, the one-hot code has become the default choice in modern deep neural networks (DNNs) due to its simplicity in decision making. However, it suffers from a significant limitation in its ability to achieve high robust accuracy, particularly in the presence of weight errors. While recent studies have experimentally demonstrated that non-one-hot ECOCs with multi-bit error correction ability could be a better solution, there is a notable absence of theoretical foundations that can elucidate the relationship between codeword design, weight-error magnitude, and network characteristics, so as to provide robustness guarantees. This work is positioned to bridge this gap through the lens of neural tangent kernel (NTK). We have two important theoretical findings: 1) In clean models (without weight errors), utilizing one-hot code and non-one-hot ECOC is akin to altering decoding metrics from $l_2$ distance to Mahalanobis distance.
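To make the contrast between one-hot and non-one-hot codes concrete, the following sketch decodes a network output to the nearest codeword and compares the minimum Hamming distance of two codebooks. The 7-bit codebook, the function names, and the squared $l_2$ decoding rule are illustrative assumptions, not the construction used in the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_hamming_distance(codebook):
    """Smallest pairwise Hamming distance d; a code corrects (d - 1) // 2 bit errors."""
    return min(hamming(a, b) for a, b in combinations(codebook, 2))


def decode(output, codebook):
    """Return the class whose codeword is nearest to the output in squared l2 distance."""
    def sq_dist(word):
        return sum((o - w) ** 2 for o, w in zip(output, word))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))


# One-hot code for 4 classes: any two codewords differ in exactly 2 bits.
one_hot = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# Hypothetical 7-bit codebook for 4 classes with pairwise Hamming distance 4.
ecoc = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0],
]

d_one_hot = min_hamming_distance(one_hot)
d_ecoc = min_hamming_distance(ecoc)

# Flip one bit of the class-2 codeword, as a stand-in for an output corrupted by weight errors.
corrupted = list(ecoc[2])
corrupted[0] = 1 - corrupted[0]
recovered_class = decode(corrupted, ecoc)
```

With distance 2, the one-hot code cannot guarantee correction of even a single flipped output bit, whereas the longer codebook still decodes the corrupted output to class 2.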
A Proof and Derivations
However, the underlying clean model doesn't always exist for imperfect model Theorem A.1 (Necessary and sufficient conditions for the existence of the underlying clean model). This theorem is a straightforward corollary of Bochner's I. (26) We can also expand the Hessian of the log q We can then prove Theorem 2.3. All the experiments conducted in this paper are run on a single NVIDIA GTX 3090.
8fd7f981e10b41330b618129afcaab2d-Supplemental.pdf
In this supplemental material, we provide additional details on the theory, the algorithms, and the experiments. In Section 2, we continue analyzing the population level difference between Trojaned and clean models, with a focus on the short-cuts. A boundary operator on a $p$-simplex takes all its adjacent $(p-1)$-simplices. In particular, the boundary of an edge consists of its adjacent nodes; the boundary of a triangle consists of its three edges. More generally, the boundary of a $p$-chain is the formal sum of the boundary of all its elements, $\partial(c) = \sum_{\sigma \in c} \partial(\sigma)$. After the reduction, the pivoting entries of the reduced matrix correspond to pairs of simplices.
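The reduction mentioned above can be sketched with the standard left-to-right column reduction of a boundary matrix. The sketch assumes coefficients in $\mathbb{Z}_2$ (so adding two columns is a symmetric difference) and a filtration given by simplex index; the function name and the toy triangle are illustrative and not taken from the supplemental material.

```python
def reduce_boundary_matrix(columns):
    """Reduce a boundary matrix over Z_2 and return (birth, death) simplex index pairs.

    columns[j] is the set of row indices of the boundary of simplex j,
    with simplices indexed in filtration order.
    """
    reduced = [set(c) for c in columns]
    pivot_owner = {}          # pivot row index -> column that owns this pivot
    pairs = []
    for j, col in enumerate(reduced):
        # While another column already owns this column's pivot (lowest entry),
        # add that column (mod 2) to cancel the pivot.
        while col and max(col) in pivot_owner:
            col ^= reduced[pivot_owner[max(col)]]
        if col:
            pivot = max(col)
            pivot_owner[pivot] = j
            pairs.append((pivot, j))   # simplex `pivot` is paired with simplex j
    return pairs


# Toy filtration of a filled triangle:
# vertices 0, 1, 2; edges 3 = (0,1), 4 = (1,2), 5 = (0,2); triangle 6 = (3,4,5).
boundary = [
    set(), set(), set(),      # vertices have empty boundary
    {0, 1}, {1, 2}, {0, 2},   # each edge's boundary is its two end nodes
    {3, 4, 5},                # the triangle's boundary is its three edges
]

pairs = reduce_boundary_matrix(boundary)
```

In this example, edges 3 and 4 merge connected components born at vertices 1 and 2, edge 5 reduces to an empty column because it closes a loop, and the triangle 6 then fills that loop, giving the pair (5, 6).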
BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation
Ye, Liang, Chen, Shengqin, Dai, Jiazhu
The rapid progress of graph generation has raised new security concerns, particularly regarding backdoor vulnerabilities. While prior work has explored backdoor attacks in image diffusion and unconditional graph generation, conditional graph generation, especially text-guided generation, remains largely unexamined. This paper proposes BadGraph, a backdoor attack method against latent diffusion models for text-guided graph generation. BadGraph leverages textual triggers to poison training data, covertly implanting backdoors that induce attacker-specified subgraphs during inference when triggers appear, while preserving normal performance on clean inputs. Extensive experiments on four benchmark datasets (PubChem, ChEBI-20, PCDes, MoMu) demonstrate the effectiveness and stealth of the attack: a poisoning rate below 10% can achieve a 50% attack success rate, while 24% suffices for a success rate above 80%, with negligible performance degradation on benign samples. Ablation studies further reveal that the backdoor is implanted during VAE and diffusion training rather than pretraining. These findings reveal the security vulnerabilities of latent diffusion models for text-guided graph generation, highlight the serious risks in applications of such models, for example drug discovery, and underscore the need for robust defenses against backdoor attacks in such diffusion models.